Abstract

Refactoring is an activity that improves the internal
structure of the code without altering its external
behavior. When performed on the production code, the
tests can be used to verify that the external behavior 
of the production code is preserved. However, when 
the refactoring is performed on test code, there is no 
safety net that assures that the external behavior of 
the test code is preserved. 

In this paper, we propose to adopt mutation testing 
as a means to verify if the behavior of the test code is 
preserved after refactoring. Moreover, we also show 
how this approach can be used to identify the part of 
the test code which is improperly refactored. 

1 Introduction 

Refactoring is “the process of changing a software
system in such a way that it does not alter the external
behavior of the code yet improves its internal
structure” [7]. If applied correctly, refactoring improves
the design of software, makes software easier
to understand, helps to find faults, and helps to develop
a program faster [7]. However, the process of
refactoring is not always performed flawlessly [?]:
faults can be introduced into the refactored code
through mistakes made by developers, or through
automated refactoring tools that do not preserve the
behavior of the code. Thus, there is a need for a
safety net that protects developers when they refactor
the test code improperly.

Refactoring does not only target the production
code, but it also actively involves the test code. Ideally,
in object-oriented systems, each production
class has a related counterpart in the test section.
As a consequence, the size of the test suite increases
linearly with the size of the system. This scenario
is particularly common in software systems developed
using test-driven development, since it leads
to a rapid development of test suites [2]. In this context,
it is important to also refactor the test code to
keep it synchronized with the evolution of the production
code and avoid its quality erosion [21].

Refactoring of the production code can be done
with less risk using a test suite, since it provides a
safeguard against regressions during software transformation.
Tests ensure that the production code
preserves its external behavior pre- and post-refactoring.
On the contrary, there is no widely accepted
method to verify whether a refactored test suite preserves
its external behavior. Several studies point out the
peculiarities of test code refactoring [23, ?, ?, ?].
However, none of them provides an operative method
to guarantee that such refactoring preserves the
behavior of the test. To address this shortcoming, we
propose the adoption of mutation testing as a safety
net for test code refactoring.

Mutation testing provides a repeatable and scientific
approach to measure the quality of the test code.
It consists of two phases: first, generating faulty
versions of the code by injecting a single fault into
the code (creating a mutant), and then executing the
test suite on each faulty version to determine
the outcome. The output of mutation testing is
a quality metric (mutation coverage), defined as the
number of faults that caused at least one test to
fail (killed mutants) divided by the total number
of created mutants, expressed as a percentage. This
metric has been shown to simulate faults realistically [?].
This is due to the fact that the faults introduced by
each mutant are modeled after the common mistakes
developers often make [?].

A correctly performed refactoring of the test code
should not change its external behavior, and consequently
its mutation coverage should remain unaltered.
For this reason, we propose to calculate the mutation
coverage of each class in the production code pre-
and post-refactoring of the test code. The comparison
between both reveals whether the refactoring had
any effect on the tests covering that class. Moreover,
this approach points out the location of the injected
faults, making it easy to spot which part of the test
code was improperly refactored.
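As an illustration, the per-class comparison described above can be sketched in a few lines of Java. This is only a minimal sketch under our own assumptions: the class CoverageComparison, the method changedClasses, and the class names and percentages used in main are hypothetical and are not taken from LittleDarwin or from the projects studied in this paper.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: compares per-class mutation coverage before and after a test refactoring. */
final class CoverageComparison {

    /** Returns the production classes whose coverage differs, mapped to {before, after}. */
    static Map<String, double[]> changedClasses(Map<String, Double> before,
                                                Map<String, Double> after) {
        Map<String, double[]> changed = new TreeMap<>();
        for (Map.Entry<String, Double> entry : before.entrySet()) {
            Double post = after.get(entry.getKey());
            // A missing class means the production code changed between the runs,
            // a situation this comparison is not meant to handle.
            if (post == null) {
                throw new IllegalArgumentException("No post-refactoring result for " + entry.getKey());
            }
            if (Double.compare(entry.getValue(), post) != 0) {
                changed.put(entry.getKey(), new double[] {entry.getValue(), post});
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        // Hypothetical class names and coverage percentages, for illustration only.
        Map<String, Double> before = new LinkedHashMap<>();
        before.put("Account", 50.0);
        before.put("Invoice", 100.0);
        Map<String, Double> after = new LinkedHashMap<>();
        after.put("Account", 0.0);
        after.put("Invoice", 100.0);
        changedClasses(before, after).forEach((cls, cov) ->
                System.out.println(cls + ": " + cov[0] + "% -> " + cov[1] + "%"));
    }
}
```

Any class reported by such a comparison is a candidate for closer inspection of the tests that cover it; classes with identical coverage in both runs give no indication of a behavior change.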

To validate our approach we run the experiments
on two projects. The first project is a simple system
created ad hoc to show how mutation testing is capable
of identifying a change in the external behavior of
a refactored test. The second project is used to verify
our approach in a real open source system with a
refactored test suite.

The paper has the following structure. Section 2
reports the background notions related to mutation
testing. Section 3 describes the research approach
adopted. Section 4 discusses the results of our research.
Section 5 presents the threats to validity.
Section 6 reports the related work. Finally, Section 7
summarizes our findings.


2 Background 

This section describes the typical quality metrics of
the test code, and background information related to
mutation testing. In addition, it provides an overview
of the implementation of mutation testing in
LittleDarwin^1.

2.1 Simple Coverage Metrics 

There are simple coverage metrics available to estimate
the quality of a test suite [55]. Statement coverage
determines the percentage of statements executed
by the test code. In a similar fashion, branch coverage
determines the percentage of the branches of
code that are executed by the test code. A branch is
created in a program when a control statement (e.g.
if or switch statements) provides two or more paths
of execution. These metrics provide an overview of
the quality of the test suite in an easily attainable
manner, yet they are inadequate for their purpose of
estimating quality [26]. Even 100% branch coverage
leaves a lot of room for a fault to escape [?].
Furthermore, branch coverage is also a poor measure
for building a detailed map of the weaknesses in a
test suite because, first, it cannot reveal
which types of faults are being caught and which are
not; and second, it is difficult for practical tools to
trace the execution paths during the runtime of complicated
software systems. Thus, these metrics are
not adequate to discover small mistakes in
the test code, or to trace a change in behavior back
to the faulty code.

2.2 Mutation Testing 

Mutation testing is the process of injecting faults
into software, and counting the number of intentional
faults which make at least one test fail. The idea of
mutation testing was first mentioned in a class paper
by Lipton [?] and later developed by DeMillo, Lipton
and Sayward [?]. The first implementation of a
mutation testing tool was done by Timothy Budd in
1980 [3]. The procedure is executed in the following
manner: first, faulty versions of the software are
created by introducing a single fault into the system
(Mutation). This is done by applying a known transformation
on a certain part of the code (Mutation
Operator or Mutator). The more mutants generated
for a class, the higher the chance of detecting a change
in behavior. After generating the faulty versions of
the software (Mutants), the test suite is executed on
each one of these mutants. If there is an error or failure
during the execution of the test suite, the mutant
is regarded as killed. On the other hand, if all tests
pass, it means that the test suite could not catch the
fault and the mutant has survived. This procedure
demands a green test suite (a test suite in which all
the tests pass) to run correctly. An overview of this
procedure can be observed in Figure 1.

Figure 1: Mutation testing procedure

^1 http://littledarwin.parsai.net/


\[ \text{Mutation Coverage} = \frac{\text{Killed Mutants}}{\text{All Mutants}} \tag{1} \]

The final result is calculated using Equation 1.
This metric provides a more detailed image of the
quality of a test suite by emphasizing test results. It
makes sure that the kinds of faults simulated by mutation
operators are covered by the tests, thereby
reducing the chance of missing such faults in the final
product.
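The procedure and Equation 1 can be illustrated with a small, self-contained Java sketch. It is not how LittleDarwin is implemented: here a mutant is simply modeled as a function, the "test suite" is a single predicate, and every name (MutationCoverageSketch, countKilled, mutationCoverage) as well as the two mutants are hypothetical choices made for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.IntBinaryOperator;
import java.util.function.Predicate;

final class MutationCoverageSketch {

    /** Equation 1, expressed as a percentage. */
    static double mutationCoverage(int killed, int total) {
        if (total <= 0) {
            throw new IllegalArgumentException("No mutants were generated");
        }
        return 100.0 * killed / total;
    }

    /** Runs the suite on every mutant; a mutant is killed when the suite does not pass. */
    static int countKilled(List<IntBinaryOperator> mutants,
                           Predicate<IntBinaryOperator> suitePasses) {
        int killed = 0;
        for (IntBinaryOperator mutant : mutants) {
            if (!suitePasses.test(mutant)) {
                killed++;
            }
        }
        return killed;
    }

    public static void main(String[] args) {
        IntBinaryOperator original = (a, b) -> a + b;
        // A deliberately weak "test suite": one check with the inputs 2 and 2.
        Predicate<IntBinaryOperator> suite = add -> add.applyAsInt(2, 2) == 4;
        // Mutation testing requires a green suite on the unmodified code.
        if (!suite.test(original)) {
            throw new IllegalStateException("The suite must pass on the original code");
        }
        List<IntBinaryOperator> mutants = Arrays.asList(
                (a, b) -> a - b,   // 2 - 2 = 0, the check fails: killed
                (a, b) -> a * b);  // 2 * 2 = 4, the check passes: survived
        int killed = countKilled(mutants, suite);
        System.out.println(mutationCoverage(killed, mutants.size()) + "%");
    }
}
```

In this sketch one of the two mutants is killed, so the coverage is 50%. The surviving mutant points at a weakness of the check itself: the chosen inputs cannot distinguish addition from multiplication.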


2.3 LittleDarwin 

LittleDarwin^1 is a mutation testing tool created by
the first author. This tool is designed to offer an
alternative framework for those who need to apply
mutation testing to complex systems, but it is capable
of analyzing simple systems as well. LittleDarwin
is designed to be independent from the testing structure.
As a result, LittleDarwin demands much less
compatibility from the target system in order to perform
its analysis. Thus, it can be run on any build
structure no matter how complex it is, given the following
conditions:

1. The build process must be able to run the test
suite.

2. The build process must return non-zero if any
test fails, and zero if all tests pass.

3. The build process must be sufficiently fast to
keep the total run time practical.

LittleDarwin is designed in an expandable way, so
that interested developers can develop their own mutation
components and still use the structure of the
main software to run the mutation analysis. This
broadens the scope of its applicability. In its current
version, LittleDarwin supports mutation testing
of Java programs.

In total, there are 9 mutation operators implemented
in LittleDarwin. These mutators are collectively
known as the minimal set. The description of
each mutator along with an example can be found in
Table 1.

Operator  Description                               Before    After
AOR-B     Replaces a binary arithmetic operator     a + b     a - b
AOR-S     Replaces a shortcut arithmetic operator   ++a       --a
AOR-U     Replaces a unary arithmetic operator      -a        +a
LOR       Replaces a logical operator               a & b     a | b
SOR       Replaces a shift operator                 a >> b    a << b
ROR       Replaces a relational operator            a >= b    a < b
COR       Replaces a binary conditional operator    a && b    a || b
COD       Removes a unary conditional operator      !a        a
SAOR      Replaces a shortcut assignment operator   a *= b    a /= b

Table 1: LittleDarwin mutation operators (the Before and After columns show an example of each)
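The second build condition listed in this section amounts to a simple contract between the mutation tool and the build process. The following Java sketch shows one way such a contract could be consumed; it is an assumption-laden illustration, not LittleDarwin's actual code. The class BuildOutcome, its methods and the MutantStatus enum are hypothetical names; only the ProcessBuilder and Process calls are standard Java API.

```java
import java.io.File;
import java.io.IOException;
import java.util.List;

final class BuildOutcome {

    enum MutantStatus { KILLED, SURVIVED }

    /** Exit code zero means every test passed, so the injected fault went unnoticed. */
    static MutantStatus classify(int exitCode) {
        return exitCode == 0 ? MutantStatus.SURVIVED : MutantStatus.KILLED;
    }

    /** Runs the given build command in the project directory and returns its exit code. */
    static int runBuild(List<String> command, File projectDir)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .directory(projectDir)
                .inheritIO()   // show the build output on the console
                .start();
        return process.waitFor();
    }
}
```

For a project built with Maven, for example, the command passed to runBuild could be the list "mvn", "test"; this choice of build tool is our assumption and any command that honors the exit code convention would do. Because each mutant requires one full run of this command, the third condition (a sufficiently fast build) directly bounds the total analysis time.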
3 Experimental Setup

As described before, the problem we try to solve is
to detect improper refactoring. So, we aim to provide
an operative method to verify whether or not
the refactoring activity has changed the external behavior
of the test suite. To achieve this, we examine
two cases:

• A use case where an ’improper’ refactoring of the
test code changes the testing behavior.

• A use case where a ’proper’ refactoring of the
test code does not change the testing behavior.

Figure 2: UML class diagram of the toy project

Two projects are selected as our cases. First, we
create a toy project to exhibit the ability of mutation
testing to highlight a change in the behavior of
the test suite. We use the same project to demonstrate
the usage of mutation testing to identify the
improperly refactored part of the test code. Second,
we use an open source project to verify how our approach
can be applied to real refactorings that affect
the test code.

Each project has two versions: pre- and post-refactoring.
For each version, we use JaCoCo^2 to calculate
statement and branch coverage, and LittleDarwin to
run mutation testing.


two versions of the test code: pre- and post-refactoring.

The pre-refactoring test suffers from two code
smells: (1) a conditional statement that checks the
type of the input variable [7], and (2) assertion
roulette [?]. In the post-refactoring test these code
smells are removed by introducing three separate test
methods. Here, we simulate the introduction of a
naive mistake (Figure 3, red area; the line commented
as erroneous in the listing): during a copy & paste
operation the developer did not correctly
adapt the method salaryManagerTest(). In
the post-refactoring version, instead of the correct
value of 2500, the value is set to 1500. This mistake
is introduced to show (in Section 4) how mutation
testing can detect behavior change and be used
to trace back improperly refactored tests.

Pre-Refactoring:

    // Assumes JUnit 4.
    import static org.junit.Assert.assertTrue;
    import static org.junit.Assert.fail;
    import org.junit.Test;

    public class salaryTest {

        // Checks the minimum pay expected for each employee type.
        boolean checkPayment(Employee x) {
            String type = x.getClass().getName();
            switch (type) {
                case "salary.Engineer":
                    if (2000 <= x.payAmount()) return true;
                    break;
                case "salary.Manager":
                    if (2500 <= x.payAmount()) return true;
                    break;
                case "salary.Salesman":
                    if (1500 <= x.payAmount()) return true;
                    break;
            }
            return false;
        }

        @Test
        public void salaryEmployeeTest() {
            Employee e = new Engineer(), m = new Manager(), s = new Salesman();
            try {
                assertTrue(checkPayment(e));
                assertTrue(checkPayment(m));
                assertTrue(checkPayment(s));
                System.out.println("Test Passed!");
            } catch (AssertionError err) {
                System.out.println("Test Failed!");
                fail();
            }
        }
    }

Post-Refactoring:

    // Assumes JUnit 4.
    import static org.junit.Assert.assertTrue;
    import static org.junit.Assert.fail;
    import org.junit.Test;

    public class salaryTest {

        @Test
        public void salaryEngineerTest() {
            Employee e = new Engineer();
            try {
                assertTrue(2000 <= e.payAmount());
                System.out.println("EngineerTest Passed!");
            } catch (AssertionError err) {
                System.out.println("EngineerTest Failed!");
                fail();
            }
        }

        @Test
        public void salaryManagerTest() {
            Employee m = new Manager();
            try {
                // Erroneous assertion
                assertTrue(1500 <= m.payAmount());
                System.out.println("ManagerTest Passed!");
            } catch (AssertionError err) {
                System.out.println("ManagerTest Failed!");
                fail();
            }
        }

        @Test
        public void salarySalesmanTest() {
            Employee s = new Salesman();
            try {
                assertTrue(1500 <= s.payAmount());
                System.out.println("SalesmanTest Passed!");
            } catch (AssertionError err) {
                System.out.println("SalesmanTest Failed!");
                fail();
            }
        }
    }

Figure 3: Test class pre- and post-refactoring

3.2 Real Project

We analyze the open source project Codec^3. Using
Ref-Finder [?], we identify which refactorings
were performed during its evolution and, among them,
which ones affected the test code. All of them were
manually validated. Moreover, during the manual
inspection a few other refactorings were added to the
list.

Table 2 reports all test refactorings identified. As
we can see, CodecBasicsTest is involved in several
types of refactorings. One of the relevant refactorings
in terms of test reengineering was the extraction
of several nested classes from CodecBasicsTest, which
was later followed by a further extract class refactoring
(Figure 4). In this refactoring, two extra test
classes were created to incorporate assertions related
to their corresponding classes in the production code,
thus eliminating the assertion roulette and god class code
smells in CodecBasicsTest.

^2 http://www.eclemma.org/jacoco/

^3 http://github.com/addthis/codec/

In the toy project the refactoring does not affect
the production code. For this reason we were able
to verify on the same version of the production code
whether the refactoring of the test code preserved its
external behavior. In a real project like Codec, refactorings
and other maintenance activities co-occur in
production and test code. In this scenario, we cannot
tell whether a change in the behavior of the refactored
test suite is caused by the test refactoring or by the
production code change. For this
reason, we had to introduce an alternative version of
Codec in our experiment, in which the refactoring is
restricted to the test code. To accomplish this task,
we had to go through the history of the project and:

1. Identify when a refactoring is performed on the
test suite.

2. Backport this refactoring to the previous version
of the system.

3. Create an alternative version of the system
where the production code is the same, but the
test code is refactored. We call this the
post-clean-refactoring version.

At the end of the process, we have two versions of
the system: pre-refactoring and post-clean-refactoring.
Both versions have the same production code,
but differ in the refactoring of the test suite. These
two versions are the ones we use to verify whether
the test refactoring modifies its external behavior.

4 Results

In this section we show how mutation coverage is able
to highlight whether or not a test refactoring causes
a modification of its external behavior. For the toy
project we compute mutation coverage along with the
percentage of passed tests and statement and branch
coverage, to show that the latter metrics are not suitable
for detecting a change in the test behavior. For the
Codec project, we limit our analysis to mutation coverage.

Refactoring                          Target Class              Instances
Remove Parameter                     CodecGenericsTest         1
Add Parameter                        CodecGenericsTest         1
Rename Method                        CodecRWOnlyTest           1
Move Method                          CodecBasicsTest           1
Extract Nested Class                 CodecBasicsTest           12
Extract Class                        CodecBasicsTest           1
Rename Class                         CodecBasicsTest           1
Remove Control Flag                  CodecBasicsTest           1
Replace Magic Number with Constant   CodecTest                 2
Replace Magic Number with Constant   CodecUtilTest             1
Replace Magic Number with Constant   CodecObjectSubclassTest   1

Table 2: Refactorings in Codec test code

[Bar chart comparing the pre-refactoring and post-refactoring versions; series: mutation coverage, branch coverage, passed tests percentage]

Figure 5: Percentage of passing tests, statement and
branch coverage and mutation coverage for all classes
of the toy project

4.1 Toy Project 

The toy project has a refactoring of the test code.
One of the refactorings was improperly done: a fault
was introduced in salaryManagerTest. Here we
compute the three metrics that are generally used to
evaluate test quality: percentage of passing tests,
statement and branch coverage, and mutation coverage.
The results are presented in Figure 5. Here we
can see:
• Percentage of passing tests. In both the pre- and
post-refactoring versions, all
tests passed. The wrongly refactored method
salaryManagerTest was not detected.

• Statement and branch coverage. Both metrics
report 100% coverage for the pre- and
post-refactoring versions. Also in this case
salaryManagerTest was not detected as an improperly
refactored test.

• Mutation coverage. For the mutation analysis,
LittleDarwin introduced two mutants in the
production code by replacing the operators +
with - and vice versa (Figure 6). In the pre-refactoring
version, one of these mutants was
killed, resulting in a 50% mutation coverage,
whereas in the post-refactoring version both
mutants survived, resulting in a 0% mutation
coverage. The different mutation coverage is the
first hint that the refactoring changed the external
behavior of the test code.

Investigating which mutant changed the status
of the test (passing to not passing or
vice versa), we trace the problem back to
salaryManagerTest. Finally, comparing the
two versions of the test code we identify the
fault.^4


    public class Manager extends Employee {

        Manager() {
            super();
        }

        @Override
        int payAmount() {
            // Statement targeted by the mutant (+ replaced with -)
            return defaultSalary + extra;
        }
    }

Figure 6: The mutant that changes the status pre-
and post-refactoring in the toy project
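The effect of the improper refactoring on this mutant can be reproduced with a small stand-alone sketch. The values of defaultSalary and extra are not given in this paper, so the sketch assumes 2000 and 500; the class ManagerMutantDemo and its methods are likewise hypothetical and only mimic the assertions of the Manager test in Figure 3.

```java
import java.util.function.IntSupplier;

final class ManagerMutantDemo {

    // Assumed values, chosen for illustration only.
    static final int DEFAULT_SALARY = 2000;
    static final int EXTRA = 500;

    /** Original production code: defaultSalary + extra. */
    static int originalPay() {
        return DEFAULT_SALARY + EXTRA;
    }

    /** Mutant: the + operator is replaced with -. */
    static int mutantPay() {
        return DEFAULT_SALARY - EXTRA;
    }

    /** Mirrors the assertion in the test: threshold <= payAmount(). */
    static boolean testPasses(int threshold, IntSupplier payAmount) {
        return threshold <= payAmount.getAsInt();
    }

    public static void main(String[] args) {
        // 2500 is the pre-refactoring threshold, 1500 the erroneous post-refactoring one.
        int[] thresholds = {2500, 1500};
        for (int threshold : thresholds) {
            boolean green = testPasses(threshold, ManagerMutantDemo::originalPay);
            boolean killed = !testPasses(threshold, ManagerMutantDemo::mutantPay);
            System.out.println("threshold " + threshold
                    + ": passes on original = " + green
                    + ", mutant killed = " + killed);
        }
    }
}
```

Under these assumed values the test is green on the original code with both thresholds, which is why the percentage of passing tests and the coverage metrics do not change. The mutant, however, is killed only with the threshold 2500 and survives with 1500, which is the change in status that mutation testing reports.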


4.2 Real Project 

The Codec project presents realistic refactorings applied
to the test suite. In this project we do not compute
the percentage of passing tests or statement
and branch coverage, since these proved inadequate
to highlight a change of behavior due to test refactoring.
For the Codec project, by only computing the
mutation coverage, we obtain the results in Figure 7.
As we can see, the number of mutants killed (or survived)
in pre- and post-clean-refactoring is the same.
This implies that all refactorings, including the major
ones, were properly performed, since they did not
change the external behavior of the test code.

^4 The manual analysis of the two versions of the system is practical
for small projects where the developer has a complete understanding
of the code. In larger projects, it can be performed
automatically using a dynamic analysis tool to find out the
relationship between the tests and the classes of the production
code.

Figure 7: Mutation coverage for each class of the
Codec project
5 Threats To Validity

In this section we present the threats to validity of
our study according to the guidelines reported in [?].

Threats to internal validity concern confounding
factors that can influence the obtained results. In
this study, the mutation coverage depends on the set
of mutation operators used in the mutation testing process.
As a consequence, the ability of mutation testing
to detect behavioral changes is limited by the
type of operators adopted. To alleviate this problem,
we used the standard set of mutation operators
that most mutation testing tools support [?], since
they are able to reflect the mistakes commonly introduced
by developers. In the case of object-oriented
programming languages, an additional set of mutation
operators was designed by Ma et al. [?] to
model mistakes specific to these languages. Using
these operators would lead to more accurate results
by introducing support for new types of mistakes.
This increased accuracy might be necessary for
software systems that make wide use of object-oriented
programming structures throughout the code.
Another threat stems from the masking effect caused
by multiple failing tests. This happens when
a mutant is killed in the original version by the failure
of some tests, and the behavioral change causes
another test to fail on the same mutant. In this case,
the mutant would not change status, and therefore
the mutation coverage would stay the same. Solving
this problem is not trivial and requires more research
on the subject.
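The masking effect can be made concrete with a short Java sketch. It only illustrates the threat and does not propose a solution; the class MaskingEffectDemo, its methods, and the test names testA and testB are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class MaskingEffectDemo {

    /** A mutant is killed when at least one test fails on it. */
    static boolean killed(Set<String> failingTests) {
        return !failingTests.isEmpty();
    }

    /** True when the killed/survived status differs between the two runs. */
    static boolean statusChanged(Set<String> failingBefore, Set<String> failingAfter) {
        return killed(failingBefore) != killed(failingAfter);
    }

    public static void main(String[] args) {
        // Tests failing on the same mutant before and after the test refactoring.
        Set<String> before = new HashSet<>(Arrays.asList("testA"));
        Set<String> after = new HashSet<>(Arrays.asList("testA", "testB"));

        // The mutant is killed in both runs, so mutation coverage is unaffected...
        System.out.println("status changed: " + statusChanged(before, after));
        // ...although the set of failing tests is not the same.
        System.out.println("same failing tests: " + before.equals(after));
    }
}
```

In this hypothetical case the behavior of testB has changed, yet the mutant remains killed because testA was already failing on it. A comparison based only on the killed or survived status therefore cannot reveal the change.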

Threats to construct validity focus on how accurately
the observations describe the phenomena of
interest. For our experiment, the elements of interest
are (1) the ability of mutation testing to verify
whether test code refactoring was improperly
performed and, in that case, (2) which section of the
test code is the cause. We used a real project as well
as our toy project to cover element (1), while the experiment
on the toy project also covered element
(2). Even though the toy project is very small, its
improper refactoring is still valuable since it is
representative of a common mistake.

Threats to external validity correspond to the
generalizability of our results. In this experiment we
use only two projects. Although one of them is
a real open source project, actively
maintained and developed by a commercial company,
it is desirable to replicate this study on more
projects, especially ones where the test code was
modified with refactorings different from those we
considered.

Threats to reliability validity correspond to the degree
to which the result depends on the tools used.
We depend on Ref-Finder and manual inspection to
discover the refactoring of the test code in our real
project case. There is a possibility that we miss some
refactorings or make mistakes in this process. We
counter this risk by checking our list of refactorings
against the code changes between the two versions.
We also depend on the tools JaCoCo (to calculate
statement and branch coverage) and LittleDarwin (to
calculate mutation coverage). The outcome of JaCoCo
has been manually verified, which is feasible due to the simple
nature of our toy project. The outcome of LittleDarwin
has been tested and explored in the first author’s
master's thesis [?].

6 Related Work 

Our study refers to the adoption of mutation testing 
in the context of test refactoring. For this reason we 
present the related work divided in two parts. 

6.1 Mutation Testing 

One of the articles that performs a comprehensive
analysis of the subject is Jia et al. 2011 [8]. This article
is a literature survey that summarizes a
large amount of information about the process of mutation
testing, its performance, practicality, etc. Offutt
et al. [?] discuss the history of mutation testing
and the state of the art, and provide insight into the
future of the field. A good reference for analysis of
the mutation testing tools for Java is Delahaye et al.
2013 [?]. In that article, mutation testing
tools for Java are compared based on efficiency,
compatibility with current technologies and multiple
other factors.

Previous studies discuss mutation testing from different
points of view. However, none of them proposes
the adoption of mutation testing to analyze
behavior preservation in the context of test refactoring.

6.2 Test Refactoring 

The concept of refactoring and behavior preservation
was introduced for the first time by Opdyke [15].
However, this work does not differentiate between
refactoring of the test code and refactoring of production
code. Later on, several studies discovered
and investigated the peculiarities of test code refactoring.
van Deursen et al. were the first to highlight
the characteristics of test refactoring by providing a
list of test code smells and test-oriented refactorings
[23]. van Deursen and Moonen identified how refactoring
can affect the test code [23]. Counsell et al.
extend the testing taxonomy of van Deursen using
the inter-dependencies of the refactoring types [3].
Pipka proposed the Test-first Refactoring approach
for adapting unit tests according to software changes
[?].

None of the previous work provides a clear description
of how to refactor tests in a safe manner.
It lacks an operative way of verifying whether test
code refactoring modifies its external behavior. In
our work we address this shortcoming by proposing
mutation testing as a safety net for test code refactoring.
We describe how to use mutation coverage to obtain
an operative evaluation of the behavior preservation
of the refactored test.


7 Conclusion and Future Work 

Test code refactoring is an important maintenance
activity performed to keep the test code synchronized
with production code and avoid its quality erosion [21].
Ad hoc refactoring types and peculiar design smells
have been identified for test code [23]. Currently,
refactoring of the test code is riskier than refactoring
of the production code. Indeed, the latter
benefits from the safety net provided by the test
suite. On the other hand, test code refactoring does
not have an equivalent safeguard to assure that the
external behavior of the test code is preserved.

In this paper, we propose mutation testing as a
safety net for test code refactoring. By conducting
empirical experiments on two projects, we show
that mutation testing is (1) suitable for identifying a
change in the external behavior of a refactored test
and (2) can be used to identify which part of the
test code was improperly refactored. However, our
approach is limited by the fact that the refactoring
must be restricted to the test code. Any change to
production code would result in a different set of mutants,
which makes the comparison between two versions
much harder. The developer can avoid
this problem by doing the refactoring in two separate
phases: first on the production code, and then on the
test code. In this case, the behavioral change of the
production code can be detected using the test suite,
and then our method can still be applied. It is worth
noting that the reliability of this process in detecting
behavior changes depends on the accuracy of mutation
testing which, in turn, depends on the type of mutation
operators that are used. A different set of mutation
operators would lead to different mutants being
generated, resulting in a different detection ability for
the process.

In this empirical study, we take into account a
small open source project. In the future, we plan to
extend this analysis to several other projects with
modified test code. In particular, we will investigate
projects where the common test refactorings are performed
[?]. There is a lack of empirical studies
evaluating the proposed test code smells in different
setups (e.g. industrial settings). This
can be investigated alongside our method of detecting
improper refactorings in these setups, quantifying
the probability of occurrence of such code smells, and
assessing the risks of refactoring the code to eliminate
such smells. In addition, we plan to create a dataset
with seeded improper refactorings that can be used
as a test bench.